Scalable Self-Tuning Implementation of Smith-Waterman Algorithm for Multicore CPUs
Authors
Abstract
An improved version of the Smith-Waterman algorithm (SWA) is the most widely used method for local alignment of a pattern (or query) sequence with a database (DB) sequence. This dynamic-programming algorithm is computation-intensive. To reduce the time needed to compute the alignment score matrix, parallel versions have been implemented on GPUs and multicore CPUs, and they have shown significant speedup over their corresponding sequential versions. Our initial evaluation of an OpenMP parallelization of SWA showed linear speedup on multicore CPUs, but a closer look at performance data from both the sequential and parallel versions revealed two undesired effects: (i) as the length of the DB sequence increases, the number of elements of the alignment score matrix H computed per unit time initially increases, reaches a maximum, and then decreases continuously; (ii) the DB sequence length at which the decline starts differs from CPU to CPU. To overcome this decline in computation rate, we propose a run-time self-tuning algorithm. During execution, it determines the DB sequence length l that maximizes the computation rate, and then divides the computation of H into the computation of a set of submatrices, each with about l columns. Our study also found that the number of threads per core that delivers the highest computation rate differs from CPU to CPU. The proposed algorithm therefore also determines the optimal number of threads during execution and creates that many threads to achieve the highest possible computation rate. Our extensive evaluations of the proposed self-tuning algorithm on three different multicore, multi-CPU shared-memory machines have shown significant performance improvement.
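The column-wise splitting of H described in the abstract can be illustrated with a minimal, single-threaded Python sketch. It is not the authors' OpenMP implementation: the linear gap penalty, the scoring values, and the names `sw_score`, `sw_score_blocked`, and `pick_width` are all assumptions made for illustration. The sketch only shows that computing H strip by strip (each strip about `width` columns wide) yields the same best score as a full sweep, and how a strip width might be chosen by timing a few candidates at run time.

```python
import time


def sw_score(query, db, match=2, mismatch=-1, gap=-1):
    """Reference: best local alignment score, sweeping full rows of H."""
    prev = [0] * (len(db) + 1)
    best = 0
    for q in query:
        cur = [0] * (len(db) + 1)
        for j, d in enumerate(db, start=1):
            s = match if q == d else mismatch
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def sw_score_blocked(query, db, width, match=2, mismatch=-1, gap=-1):
    """Same score, but H is computed one column strip of `width` at a time."""
    m = len(query)
    left = [0] * (m + 1)  # column of H immediately left of the current strip
    best = 0
    for start in range(0, len(db), width):
        strip = db[start:start + width]
        w = len(strip)
        prev = [0] * (w + 1)   # row 0 of H is all zeros
        right = [0] * (m + 1)  # last column of this strip, kept for the next one
        for i, q in enumerate(query, start=1):
            cur = [0] * (w + 1)
            cur[0] = left[i]   # boundary value carried over from the previous strip
            for j, d in enumerate(strip, start=1):
                s = match if q == d else mismatch
                cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
                best = max(best, cur[j])
            right[i] = cur[w]
            prev = cur
        left = right
    return best


def pick_width(query, sample, candidates):
    """Hypothetical tuner: return the candidate width with the highest cells/second."""
    cells = len(query) * len(sample)
    best_width, best_rate = candidates[0], 0.0
    for width in candidates:
        t0 = time.perf_counter()
        sw_score_blocked(query, sample, width)
        elapsed = time.perf_counter() - t0
        rate = cells / elapsed if elapsed > 0 else float("inf")
        if rate > best_rate:
            best_width, best_rate = width, rate
    return best_width
```

In interpreted Python the timing differences between widths will mostly reflect interpreter overhead rather than the cache effects the abstract reports, so `pick_width` only demonstrates the shape of a measure-then-choose loop, not the paper's tuning procedure.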
Similar resources
High-fidelity simulation of collective effects in electron beams using an innovative parallel method
Among the most challenging and heretofore unsolved problems in accelerator physics is accurate simulation of the collective effects in electron beams. Electron beam dynamics is crucial to understanding and designing: (i) high-brightness synchrotron light sources, powerful tools for cutting-edge research in physics, biology, medicine and other fields, and (ii) electron-ion particle collider...
An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs
The tile QR factorization provides an efficient and scalable way for factoring a dense matrix in parallel on multicore processors. This article presents a way of efficiently implementing the algorithm on a system with a powerful GPU and many multicore CPUs.
Revisiting the Speed-versus-Sensitivity Tradeoff in Pairwise Sequence Search
The Smith-Waterman algorithm is a dynamic programming method for determining optimal local alignments between nucleotide or protein sequences. However, it suffers from quadratic time and space complexity. As a result, many algorithmic and architectural enhancements have been proposed to solve this problem, but at the cost of reduced sensitivity in the algorithms or significant expense in hardwa...
A parallel and sensitive software tool for methylation analysis on multicore platforms
MOTIVATION: DNA methylation analysis suffers from very long processing times, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis seems to scale efficiently neither with the size of the dataset nor...
ParaXML: A Parallel XML Processing Model on the Multicore CPUs
XML has emerged as the de facto standard interoperable data format for web services, databases, and document processing systems. The processing of XML documents, however, has been recognized as the performance bottleneck in those systems; as a result, the demand for high-performance XML processing is growing rapidly. On the hardware front, the multicore processor is increasingly becoming avai...